Goto

Collaborating Authors

 springer nature 2021


HieroGlyphTranslator: Automatic Recognition and Translation of Egyptian Hieroglyphs to English

Nasser, Ahmed, Mohamed, Marwan, Sherif, Alaa, Mahmoud, Basmala, Yehia, Shereen, Saad, Asmaa, El-Rahmany, Mariam S., Mohamed, Ensaf H.

arXiv.org Artificial Intelligence

Egyptian hieroglyphs, the ancient Egyptian writing system, are composed entirely of drawings. Translating these glyphs into English poses various challenges, including the fact that a single glyph can have multiple meanings. Deep learning translation applications are evolving rapidly, producing remarkable results that significantly impact our lives. In this research, we propose a method for the automatic recognition and translation of ancient Egyptian hieroglyphs from images to English. This study utilized two datasets for classification and translation: the Morris Franken dataset and the EgyptianTranslation dataset. Our approach is divided into three stages: segmentation (using Contour and Detectron2), mapping symbols to Gardiner codes, and translation (using the CNN model). The model achieved a BLEU score of 42.2, a significant result compared to previous research.


Transfer in Reinforcement Learning via Regret Bounds for Learning Agents

Tuynman, Adrienne, Ortner, Ronald

arXiv.org Artificial Intelligence

We present an approach for the quantification of the usefulness of transfer in reinforcement learning via regret bounds for a multi-agent setting. Considering a number of $\aleph$ agents operating in the same Markov decision process, however possibly with different reward functions, we consider the regret each agent suffers with respect to an optimal policy maximizing her average reward. We show that when the agents share their observations the total regret of all agents is smaller by a factor of $\sqrt{\aleph}$ compared to the case when each agent has to rely on the information collected by herself. This result demonstrates how considering the regret in multi-agent settings can provide theoretical bounds on the benefit of sharing observations in transfer learning.


MicroAUNet: Boundary-Enhanced Multi-scale Fusion with Knowledge Distillation for Colonoscopy Polyp Image Segmentation

Wang, Ziyi, Zhang, Yuanmei, Esrafilzadeh, Dorna, Jalili, Ali R., Xiang, Suncheng

arXiv.org Artificial Intelligence

Early and accurate segmentation of colorectal polyps is critical for reducing colorectal cancer mortality, which has been extensively explored by academia and industry. However, current deep learning-based polyp segmentation models either compromise clinical decision-making by providing ambiguous polyp margins in segmentation outputs or rely on heavy architectures with high computational complexity, resulting in insufficient inference speeds for real-time colorectal endoscopic applications. To address this problem, we propose MicroAUNet, a light-weighted attention-based segmentation network that combines depthwise-separable dilated convolutions with a single-path, parameter-shared channel-spatial attention block to strengthen multi-scale boundary features. On the basis of it, a progressive two-stage knowledge-distillation scheme is introduced to transfer semantic and boundary cues from a high-capacity teacher. Extensive experiments on benchmarks also demonstrate the state-of-the-art accuracy under extremely low model complexity, indicating that MicroAUNet is suitable for real-time clinical polyp segmentation. The code is publicly available at https://github.com/JeremyXSC/MicroAUNet.


Breaking the Euclidean Barrier: Hyperboloid-Based Biological Sequence Analysis

Ali, Sarwan, Mansoor, Haris, Patterson, Murray

arXiv.org Artificial Intelligence

Genomic sequence analysis plays a crucial role in various scientific and medical domains. Traditional machine-learning approaches often struggle to capture the complex relationships and hierarchical structures of sequence data when working in high-dimensional Euclidean spaces. This limitation hinders accurate sequence classification and similarity measurement. To address these challenges, this research proposes a method to transform the feature representation of biological sequences into the hyperboloid space. By applying a transformation, the sequences are mapped onto the hyperboloid, preserving their inherent structural information. Once the sequences are represented in the hyperboloid space, a kernel matrix is computed based on the hyperboloid features. The kernel matrix captures the pairwise similarities between sequences, enabling more effective analysis of biological sequence relationships. This approach leverages the inner product of the hyperboloid feature vectors to measure the similarity between pairs of sequences. The experimental evaluation of the proposed approach demonstrates its efficacy in capturing important sequence correlations and improving classification accuracy.


Morning commute in congested urban rail transit system: A macroscopic model for equilibrium distribution of passenger arrivals

Zhang, Jiahua, Wada, Kentaro, Oguchi, Takashi

arXiv.org Artificial Intelligence

In many metropolises, the congestion and delay of rail transit have brought about tremendous psychological stress to commuters and considerable economic loss to the society. For example, according to a report by the Ministry of Land, Infrastructure, Transport and Tourism of Japan, on an average, train delays (more than 5 min) were observed for 45 railway lines in the Tokyo metropolitan area in 11.7 days of 20 weekdays in a month, and more than half of the short delays (within 10 min) were caused by extended dwell time (MLIT, 2020). Kariyazaki et al (2015) estimated that in Japan, train delays resulted in social cost in excess of 1.8 billion dollars per year. In a high-frequency operated rail transit system, when a train delay occurs because of either an accident or extended dwell time, the subsequent trains are forced to decelerate or stop between stations to maintain a safety clearance, which is a so-called "knock-on delay" on the rail track (Carey and Kwieci nski, 1994). Meanwhile, more passengers are kept waiting on the platform when trains decelerate or stop (because headways of trains are extended), which results in a longer dwell time of trains.


Optimizing Mesh to Improve the Triangular Expansion Algorithm for Computing Visibility Regions

Mikula, Jan, Kulich, Miroslav

arXiv.org Artificial Intelligence

This paper addresses the problem of improving the query performance of the triangular expansion algorithm (TEA) for computing visibility regions by finding the most advantageous instance of the triangular mesh, the preprocessing structure. The TEA recursively traverses the mesh while keeping track of the visible region, the set of all points visible from a query point in a polygonal world. We show that the measured query time is approximately proportional to the number of triangle edge expansions during the mesh traversal. We propose a new type of triangular mesh that minimizes the expected number of expansions assuming the query points are drawn from a known probability distribution. We design a heuristic method to approximate the mesh and evaluate the approach on many challenging instances that resemble real-world environments. The proposed mesh improves the mean query times by 12-16% compared to the reference constrained Delaunay triangulation. The approach is suitable to boost offline applications that require computing millions of queries without addressing the preprocessing time. The implementation is publicly available to replicate our experiments and serve the community.


GOLFS: Feature Selection via Combining Both Global and Local Information for High Dimensional Clustering

Xing, Zhaoyu, Wan, Yang, Wen, Juan, Zhong, Wei

arXiv.org Machine Learning

It is important to identify the discriminative features for high dimensional clustering. However, due to the lack of cluster labels, the regularization methods developed for supervised feature selection can not be directly applied. To learn the pseudo labels and select the discriminative features simultaneously, we propose a new unsupervised feature selection method, named GlObal and Local information combined Feature Selection (GOLFS), for high dimensional clustering problems. The GOLFS algorithm combines both local geometric structure via manifold learning and global correlation structure of samples via regularized self-representation to select the discriminative features. The combination improves the accuracy of both feature selection and clustering by exploiting more comprehensive information. In addition, an iterative algorithm is proposed to solve the optimization problem and the convergency is proved. Simulations and two real data applications demonstrate the excellent finite-sample performance of GOLFS on both feature selection and clustering.


Personalized Federated Learning via Dual-Prompt Optimization and Cross Fusion

Zhang, Yuguang, Guo, Kuangpu, Lu, Zhihe, Wang, Yunbo, Liang, Jian

arXiv.org Artificial Intelligence

Federated learning (FL) enables collaborative model training across decentralized clients without sharing local data, but is challenged by heterogeneity in data, computation, and communication. Pretrained vision-language models (VLMs), with their strong generalization and lightweight tuning via prompts, offer a promising solution. However, existing federated prompt-learning methods rely only on text prompts and overlook joint label-domain distribution shifts. In this paper, we propose a personalized FL framework based on dual-prompt learning and cross fusion, termed pFedDC. Specifically, each client maintains both global and local prompts across vision and language modalities: global prompts capture common knowledge shared across the federation, while local prompts encode client-specific semantics and domain characteristics. Meanwhile, a cross-fusion module is designed to adaptively integrate prompts from different levels, enabling the model to generate personalized representations aligned with each client's unique data distribution. Extensive experiments across nine datasets with various types of heterogeneity show that pFedDC consistently outperforms state-of-the-art methods.


Model compression using knowledge distillation with integrated gradients

Hernandez, David E., Chang, Jose, Nordling, Torbjörn E. M.

arXiv.org Artificial Intelligence

Model compression is critical for deploying deep learning models on resource-constrained devices. We introduce a novel method enhancing knowledge distillation with integrated gradients (IG) as a data augmentation strategy. Our approach overlays IG maps onto input images during training, providing student models with deeper insights into teacher models' decision-making processes. Extensive evaluation on CIFAR-10 demonstrates that our IG-augmented knowledge distillation achieves 92.6% testing accuracy with a 4.1x compression factor-a significant 1.1 percentage point improvement ($p<0.001$) over non-distilled models (91.5%). This compression reduces inference time from 140 ms to 13 ms. Our method precomputes IG maps before training, transforming substantial runtime costs into a one-time preprocessing step. Our comprehensive experiments include: (1) comparisons with attention transfer, revealing complementary benefits when combined with our approach; (2) Monte Carlo simulations confirming statistical robustness; (3) systematic evaluation of compression factor versus accuracy trade-offs across a wide range (2.2x-1122x); and (4) validation on an ImageNet subset aligned with CIFAR-10 classes, demonstrating generalisability beyond the initial dataset. These extensive ablation studies confirm that IG-based knowledge distillation consistently outperforms conventional approaches across varied architectures and compression ratios. Our results establish this framework as a viable compression technique for real-world deployment on edge devices while maintaining competitive accuracy.


Alice and the Caterpillar: A more descriptive null model for assessing data mining results

Preti, Giulia, Morales, Gianmarco De Francisci, Riondato, Matteo

arXiv.org Artificial Intelligence

We introduce novel null models for assessing the results obtained from observed binary transactional and sequence datasets, using statistical hypothesis testing. Our null models maintain more properties of the observed dataset than existing ones. Specifically, they preserve the Bipartite Joint Degree Matrix of the bipartite (multi-)graph corresponding to the dataset, which ensures that the number of caterpillars, i.e., paths of length three, is preserved, in addition to other properties considered by other models. We describe Alice, a suite of Markov chain Monte Carlo algorithms for sampling datasets from our null models, based on a carefully defined set of states and efficient operations to move between them. The results of our experimental evaluation show that Alice mixes fast and scales well, and that our null model finds different significant results than ones previously considered in the literature.